Abstract. Traditional methods for the analysis of system performance
and  reliability  generally  assume  a  precise  knowledge  of  the  system  and 
its  workload.  Here,  we  present  methods  that  are  suited  for  the  analysis 
of  systems  that  contain  partly  unknown  or  unspecified  components,  such 
as  systems  in  their  early  design  stages. 

We introduce stochastic transition systems (STSs), a high-level formalism for
the modeling of timed probabilistic systems. Stochastic transition systems
extend current modeling capabilities by enabling the representation
of transitions having unknown delay distributions, alongside transitions
with zero or exponentially distributed delay. We show how these various
types of transitions can be uniformly represented in terms of nondeterminism,
probability, fairness and time, yielding efficient algorithms for
system analysis. Finally, we present methods for the specification and
verification of long-run average properties of STSs. These properties include
many relevant performance and reliability indices, such as system
throughput, average response time, and mean time between failures.


1  Introduction 

The  analysis  of  system  performance  and  reliability  is  an  essential  part  of  the 
design  of  many  computing  and  communication  systems.  Most  approaches  to  the 
computation  of  performance  and  reliability  indices  presuppose  that  the  structure 
of  the  system  is  known  in  detail,  and  that  the  values  of  the  transition  probabilities 
and  the  delay  distributions  are  precisely  known.  Here,  we  describe  methods  that 
are  suited  to  the  evaluation  of  systems  that  are  still  in  their  early  stages  of 
design,  when  not  all  the  system  components  may  have  been  designed,  and  when 
relevant  quantities  may  be  known  only  with  some  approximation. 

We introduce stochastic transition systems (STSs), a high-level modeling language
for timed probabilistic systems. Stochastic transition systems provide a
concise and compositional way to describe the behavior of systems in terms of
probability, waiting-time distributions, nondeterminism, and fairness. In particular,
the execution model of STSs extends that of generalized stochastic Petri
nets [ABC84] and of stochastic process algebras such as TIPP [GHR93], PEPA
[Hil96] and EMPA [BG96] with the introduction of nondeterminism and of transitions
with unspecified delay distribution. These features enable the modeling
of unknown (or imprecisely known) arrival rates and transition probabilities, as
well as the modeling of schedulers with unspecified behavior.

We  provide  two  semantics  for  STSs.  The  first  one  is  an  informal  semantics 
that  can  be  used  to  gain  an  intuitive  understanding  of  STSs,  and  to  guide  the 
construction  of  system  models.  The  second  semantics  is  defined  by  providing  a 
translation from STSs to fair timed probabilistic systems (fair TPSs), a low-level
computational  model  based  on  Markov  decision  processes  that  is  well  suited  to 
the  application  of  verification  algorithms.  The  relation  between  an  STS  and  its 
translation  TPS  parallels  the  relation  between  a  first-order  transition  system 
and  its  representation  as  a  state-transition  graph;  in  particular,  the  state  space 
of the translation TPS coincides with that of the STS. We show that the translation
precisely captures the informal semantics of STSs, justifying the use of
the  informal  semantics  in  the  construction  of  system  models. 

The  translation  relies  on  a  new  notion  of  fairness  for  probabilistic  systems, 
called  probabilistic  fairness.  Unlike  previous  notions  of  fairness,  which  refer  to 
the transitions that are enabled and taken along system behaviors [Var85, MP91,
KB96], probabilistic fairness is a structural condition on the policies that govern
the resolution of nondeterministic choices. The condition states that, for every
policy, there must be a fixed $\varepsilon > 0$ such that every fair alternative is selected with
probability at least $\varepsilon$. Probabilistic fairness enables the faithful representation
of  transitions  with  unspecified  delay  distributions.  Probabilistic  fairness  also 
simplifies  the  analysis  of  several  algorithms,  since  its  basic  ingredients  — policies 
and  probability —  are  already  present  in  Markov  decision  processes. 

We  then  turn  our  attention  to  the  specification  and  verification  of  long-run 
average  properties  of  probabilistic  systems.  Long-run  average  properties  refer 
to  the  average  behavior  of  a  system,  measured  over  a  period  of  time  whose 
length  diverges  to  infinity.  In  a  purely  probabilistic  system,  these  properties  are 
related  to  the  steady-state  distribution  of  the  Markov  chain  corresponding  to  the 
system.  We  specify  long-run  average  properties  of  systems  by  attaching  labels  to 
the  system  states  and  transitions,  following  a  simplified  version  of  the  approach 
of  [dA98].  The  labels  specify  system  tasks,  whose  long-run  average  outcome  or 
duration  can  be  measured.  This  enables  the  specification  of  several  reliability 
and  performance  indices,  such  as  throughput,  average  response  time,  and  mean 
time  between  failures. 

Finally, we present algorithms for verifying that the performance and reliability
specifications of an STS are met even under the most unfavorable combination
of nondeterministic behavior and choice of delays for the transitions with
unknown delay distributions. The verification process is based on an adaptation
of the algorithms presented in [dA98] to systems that include fairness. We show
that the presence of fairness does not increase the complexity of the verification
problem, which can again be solved in polynomial time in the size of the fair
TPS. The analysis of the verification algorithms also shows that, when considering
long-run average properties of finite-state systems, our notion of probabilistic
fairness yields the same verification algorithms as the weak fairness of [KB96],
showing that the two notions are equivalent in this context.

2  Stochastic  Transition  Systems 

Stochastic  transition  systems  (STSs)  have  been  inspired  by  the  fair  transition 
systems  of  [MP91]  and  by  the  real-time  probabilistic  processes  of  [ACD92].  A 
stochastic transition system (STS) is a triple $\mathcal{S} = (V, \Theta, T)$, where:

— $V$ is a finite set of typed state variables, each with finite domain. The (finite)
state space $S$ consists of all type-consistent interpretations of the variables
in $V$. We denote by $s[\![x]\!]$ the value at state $s \in S$ of $x \in V$; the interpretation
function $[\![\cdot]\!]$ is extended to terms in the obvious way.

— $\Theta$ is an assertion over $V$ denoting the set $\{s \in S \mid s \models \Theta\}$ of initial states.

— $T$ is a set of transitions.

With each transition $\tau \in T$ are associated the following quantities:

— An assertion $E_\tau$ over $V$, which specifies the set of states $\{s \in S \mid s \models E_\tau\}$ on
which $\tau$ is enabled.

— A number $m_\tau > 0$ of transition modes. Each transition mode $i \in [1..m_\tau]$
corresponds to a possible outcome of $\tau$, and is specified by:

• A set of assignments $\{x' := f^\tau_{i,x}\}_{x \in V}$, where each $f^\tau_{i,x}$ is a term over $V$.
These assignments define the function $f^\tau_i : S \to S$, which maps every
state $s \in S$ into a successor $s' = f^\tau_i(s)$ such that $s'[\![x]\!] = s[\![f^\tau_{i,x}]\!]$ for all
$x \in V$.

• The probability $p^\tau_i \in [0, 1]$ with which mode $i$ is chosen. We require
$\sum_{i=1}^{m_\tau} p^\tau_i = 1$.

The set $T$ of transitions is partitioned into the two subsets $T_i$ and $T_d$ of immediate
and delayed transitions. Immediate transitions must be taken as soon as they
are enabled. A subset $T_f \subseteq T_i$ indicates the set of fair transitions. In turn, the
set $T_d$ of delayed transitions is partitioned into the sets $T_e$ and $T_u$, where:

— $T_e$ is the set of transitions with exponential delay distribution. With each
$\tau \in T_e$ is associated a transition rate $\gamma_\tau > 0$.

— $T_u$ is the set of transitions with unspecified delay distributions. These transitions
are taken with non-zero delay, but the probability distribution of
the delay, and the possible dependencies between this distribution and the
system's present state or past history, are not specified.

Given a state $s \in S$, we indicate by $T(s) = \{\tau \in T \mid s \models E_\tau\}$ the set of
transitions enabled at $s$. To ensure that $T(s) \neq \emptyset$ for all $s \in S$, we implicitly add
to every STS an idle transition $\tau_{idle} \in T_e$, defined by $E_{\tau_{idle}} = true$, $m_{\tau_{idle}} = 1$,
$p^{\tau_{idle}}_1 = 1$, $\gamma_{\tau_{idle}} = 1$, and by the set of assignments $\{x' := x\}_{x \in V}$. The choice of
a unitary transition rate is arbitrary.
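
The components of an STS listed above can be held in a small data structure. The following is a minimal Python sketch, not part of the formalism: the names `Mode`, `Transition`, `kind`, `idle_transition` and `enabled_transitions` are illustrative assumptions, and states are represented as plain dictionaries from variable names to values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, object]  # an interpretation of the variables in V


@dataclass
class Mode:
    prob: float                       # p_i^tau
    update: Callable[[State], State]  # f_i^tau, maps a state to its successor


@dataclass
class Transition:
    name: str
    enabled: Callable[[State], bool]  # enabling assertion E_tau
    modes: List[Mode]
    kind: str                         # "immediate", "exponential" or "unspecified"
    rate: Optional[float] = None      # gamma_tau, only for exponential transitions
    fair: bool = False                # membership in T_f (immediate transitions only)

    def check(self) -> None:
        """Validate the side conditions stated in the definition."""
        assert self.modes, "m_tau must be positive"
        assert abs(sum(m.prob for m in self.modes) - 1.0) < 1e-9
        assert (self.kind == "exponential") == (self.rate is not None)
        assert not self.fair or self.kind == "immediate"


def idle_transition() -> Transition:
    """Implicit idle transition: always enabled, rate 1, state unchanged."""
    return Transition("idle", lambda s: True, [Mode(1.0, lambda s: dict(s))],
                      kind="exponential", rate=1.0)


def enabled_transitions(transitions: List[Transition], s: State) -> List[Transition]:
    """T(s): the transitions whose enabling assertion holds at s."""
    return [t for t in transitions if t.enabled(s)]
```

Representing enabling assertions and assignments as Python callables is only one possible encoding; a symbolic representation would be needed to exploit the structure of the assertions.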

2.1 Informal Semantics of Stochastic Transition Systems

We  present  here  an  informal  semantics  of  STSs,  which  can  be  used  to  gain  an 
intuitive,  but  accurate,  understanding  of  their  behavior.  In  a  later  section,  we 
show  that  this  informal  semantics  precisely  corresponds  to  the  formal  semantics, 
defined  by  translation  into  lower-level  computational  models. 

In the informal semantics, the temporal evolution of the system state is represented
by a timed trace. A timed trace is an infinite sequence $(s_0, I_0), (s_1, I_1), (s_2, I_2), \ldots$
of pairs, where $I_k \subseteq \mathbb{R}^+$ is a closed interval and $s_k$ is a system state, for $k \geq 0$.
The intervals must be contiguous, i.e. $\max I_k = \min I_{k+1}$ for all $k \geq 0$, and
the first interval must begin at 0, i.e. $\min I_0 = 0$. A pair $(s_k, I_k)$ in a timed
trace indicates that during the interval of time $I_k$ the system is in state $s_k$. The
choice of considering only closed intervals is arbitrary. Note that point intervals
are permitted: they represent transitory states in which an immediate transition
is taken before time advances. These transitory states are very similar to the
vanishing markings of generalized stochastic Petri nets (GSPNs) [ABC84].

The initial state $s_0$ of a timed trace must satisfy $s_0 \models \Theta$. For $k \geq 0$, state $s_k$
determines the expected duration of $I_k$ and the next state $s_{k+1}$ as follows:

— Some immediate transition enabled. If $T(s_k) \cap T_i \neq \emptyset$, then the duration of
$I_k$ is 0. A transition $\tau \in T(s_k) \cap T_i$ is chosen nondeterministically, subject
to fairness requirements: if $\tau \in T_f$, then $\tau$ must be chosen with non-zero
probability.

Once $\tau$ has been chosen, each transition mode $i \in [1..m_\tau]$ is chosen with
probability $p^\tau_i$, and the successor state is given by $s_{k+1} = f^\tau_i(s_k)$.

— Only delayed transitions enabled. If $T(s_k) \subseteq T_d$, let $T_e(s_k) = T(s_k) \cap T_e$
and $T_u(s_k) = T(s_k) \cap T_u$. The transition rates $\gamma_\tau$ for $\tau \in T_e(s_k)$ are given; we
select nondeterministically $\gamma_\tau > 0$ for $\tau \in T_u(s_k)$. The expected duration of
$I_k$ is then given by $1 / \sum_{\tau \in T(s_k)} \gamma_\tau$, and transition $\tau \in T(s_k)$ is chosen
with probability $\gamma_\tau / \sum_{\tau' \in T(s_k)} \gamma_{\tau'}$.

Once $\tau$ has been chosen, each transition mode $i \in [1..m_\tau]$ is chosen with
probability $p^\tau_i$, and the successor state is again given by $s_{k+1} = f^\tau_i(s_k)$.
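
The case in which only delayed transitions are enabled amounts to a simple computation once every enabled transition has a rate. The Python sketch below illustrates it; the function name `delayed_step` and the dictionary encoding of rates are assumptions made for this illustration, and the rates of the unspecified-delay transitions are taken as already chosen.

```python
from typing import Dict, Tuple


def delayed_step(rates: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Given the rate of every enabled delayed transition (exponential rates as
    specified, unspecified ones as selected nondeterministically), return the
    expected duration of the interval and the probability of each transition."""
    if not rates or any(g <= 0 for g in rates.values()):
        raise ValueError("need at least one transition, all with positive rate")
    total = sum(rates.values())
    return 1.0 / total, {name: g / total for name, g in rates.items()}


# Two enabled transitions with rates 1 and 3.
duration, probs = delayed_step({"a": 1.0, "b": 3.0})
# duration == 0.25 and probs == {"a": 0.25, "b": 0.75}
```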

Time divergence. In our definition of timed trace, we have not ruled out the possibility
of traces along which time does not diverge. These traces can arise, since
the  time  intervals  in  the  trace  can  be  point  intervals,  or  can  be  arbitrarily  small. 
In  a  later  section,  we  provide  a  method  for  checking  that  non-time-divergent 
traces  occur  with  probability  0. 

2.2  An  Example  of  STS 

As  a  simple  example  of  STS,  we  consider  a  model  for  a  system  consisting  of 
a  commuter  that  continually  travels  between  cities  A  and  B,  each  way  passing 
through  an  intermediate  city  C.  Cities  A  and  C  are  connected  by  highway  link  1, 
cities  C  and  B  by  link  2.  Each  link  can  be  in  good  conditions,  in  poor  conditions, 
or  undergoing  repair:  for  i  =  1,2,  the  state  of  link  i  is  represented  by  variable 
$l_i$, with domain $\{g, p, r\}$. For each link, the transition from good to poor has
rate $\gamma_{gp} = 0.05$, and the transition from repair to good has rate $\gamma_{rg} = 0.1$. The
transition  from  poor  to  repair  has  unspecified  delay  distribution:  the  scheduling 
of  road  repairs  follows  criteria  that  are  not  known  to  the  layperson. 

The  commuter  can  be  at  one  of  4  states,  depending  on  which  segment  must  be 
traversed  next  and  in  which  direction.  The  state  of  the  commuter  is  represented 
by variable $c$, with domain $\{1, 2, 3, 4\}$: we let $c = 1$ when $A \to C$ is the next trip
to be undertaken, and similarly $c = 2$ for $C \to B$, $c = 3$ for $B \to C$, and $c = 4$ for
$C \to A$. Depending on the conditions of the next link, the commuter traverses
the link with rate $\gamma_g = 0.5$, $\gamma_p = 0.3$, or $\gamma_r = 0.1$.

The STS $\mathcal{S} = (V, \Theta, T)$ has variables $V = \{c, l_1, l_2\}$ and initial condition
$\Theta : c = 1 \wedge l_1 = g \wedge l_2 = g$. The set of transitions is
$T = \{\tau_{gp,i}, \tau_{pr,i}, \tau_{rg,i}\}_{i=1,2} \cup \{\tau_g, \tau_p, \tau_r\}$,
where transition $\tau_{gp,i}$ models link $i$ going from good to poor, transition
$\tau_g$ models the commuter traversing a good link, and the meaning of the
other transitions can be analogously inferred. We list only a few representative
transitions; the others are similar. For brevity, while describing transition $\tau$ we
write $E$ instead of $E_\tau$, and so forth.

— For $i = 1, 2$, $\tau_{gp,i} \in T_e$ is defined by $E : l_i = g$; and $\gamma = 0.05$; $m = 1$; $p_1 = 1$;
and $l'_i := p$, $l'_{3-i} := l_{3-i}$, $c' := c$.

— For $i = 1, 2$, $\tau_{pr,i} \in T_u$ is defined by $E : l_i = p$; and $m = 1$; $p_1 = 1$; and
$l'_i := r$, $l'_{3-i} := l_{3-i}$, $c' := c$.

— $\tau_g \in T_e$ is defined by $E : [(c = 1 \vee c = 4) \wedge l_1 = g] \vee [(c = 2 \vee c = 3) \wedge l_2 = g]$;
and $\gamma = 0.5$; $m = 1$; $p_1 = 1$; and $c' := (c \bmod 4) + 1$, $l'_1 := l_1$, $l'_2 := l_2$.

Alternatively, consider the case in which links in poor conditions are scheduled
for repair with rate at least 0.1. To model this case, it is possible to introduce,
for $i = 1, 2$, an additional transition in $T_e$ alongside $\tau_{pr,i}$. These transitions are defined like
$\tau_{pr,i}$, $i = 1, 2$, except that they have rate $\gamma = 0.1$. More complex combinations of
exponential-delay  and  unspecified-delay  transitions  can  be  used  to  model  more 
general  types  of  partial  knowledge  about  transition  rates. 
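
As an illustration of the commuter model, the Python sketch below encodes only the three kinds of transitions spelled out in the text (link good to poor, link poor to repair, commuter traversing a good link) and computes which of them are enabled in a given state. The tuple layout, the short names `gp1`, `pr1`, `g`, and the helper `next_link` are assumptions of this sketch; the implicit idle transition and the transitions not listed in the text are omitted.

```python
def next_link(c):
    """Link to be traversed next: link 1 for A->C (c=1) and C->A (c=4), else link 2."""
    return 1 if c in (1, 4) else 2


def transitions():
    """Each entry is (name, enabling condition, rate or None if unspecified, update)."""
    ts = []
    for i in (1, 2):
        li = f"l{i}"
        ts.append((f"gp{i}", lambda s, li=li: s[li] == "g", 0.05,
                   lambda s, li=li: {**s, li: "p"}))
        ts.append((f"pr{i}", lambda s, li=li: s[li] == "p", None,
                   lambda s, li=li: {**s, li: "r"}))
    ts.append(("g", lambda s: s[f"l{next_link(s['c'])}"] == "g", 0.5,
               lambda s: {**s, "c": s["c"] % 4 + 1}))
    return ts


init = {"c": 1, "l1": "g", "l2": "g"}          # the initial condition
enabled_at_init = [name for name, en, _, _ in transitions() if en(init)]
# enabled_at_init == ["gp1", "gp2", "g"]
```

Once link 1 becomes poor, the transition `g` is no longer enabled for a commuter with `c = 1`, while `pr1`, whose rate is unspecified, becomes enabled.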

2.3  Related  Models  for  Probabilistic  Systems 

Stochastic transition systems are related to several other models for probabilistic
systems. The execution model of STSs is related to that of generalized stochastic
Petri nets (GSPNs) [ABC84]. In particular, STSs generalize GSPNs by introducing
transitions with unspecified delay distributions, and by introducing
the possibility of nondeterministic choice among enabled immediate transitions.
STSs extend in a similar way also the probabilistic finite-state programs of [PZ86]
and the real-time probabilistic processes of [ACD92]. The introduction of nondeterminism
and of transitions with unspecified delay distributions, and the capability
to deal with these features in the verification process, also represents an
innovation with respect to probabilistic process algebras for performance modeling,
such as TIPP [GHR93], PEPA [Hil96] and EMPA [BG96]. Probabilistic
automata [SL94, Seg95] are another model that has been proposed for probabilistic
real-time systems. Probabilistic automata are more closely related to
timed probabilistic systems, our low-level model of computation, than to STSs.

3 Translating STSs into Low-Level System Models

The formal semantics of STSs is defined by translating STSs into fair timed probabilistic
systems (fair TPSs), a low-level computational model based on Markov
decision processes. Besides providing us with a formal semantics for STSs, the
translation is also used in the verification process, since the verification algorithms
will be applied to the fair TPSs obtained by translating the STSs.

3.1  Timed  Probabilistic  Systems 

A  Markov  decision  process  (MDP)  is  a  generalization  of  a  Markov  chain  in  which 
a  set  of  possible  actions  is  associated  with  each  state.  To  each  state-action  pair  is 
associated  a  probability  distribution,  used  to  select  the  successor  state  [Der70]. 
We consider a fixed set of typed state variables $V$, coinciding with the variables
of the STS. An MDP $\Pi = (S, A, p)$ consists of the following components:

— A finite set $S$ of states, where each $s \in S$ assigns value $s[\![x]\!]$ to each $x \in V$.

— For each $s \in S$, $A(s)$ is a non-empty finite set of actions available at $s$.

— For each $s, t \in S$ and $a \in A(s)$, $p_{st}(a)$ is the probability of a transition from
$s$ to $t$ when action $a$ is selected. For every $s \in S$ and $a \in A(s)$, we have
$0 \leq p_{st}(a) \leq 1$ for all $t \in S$, and $\sum_{t \in S} p_{st}(a) = 1$.

A behavior of an MDP is an infinite sequence $\omega : s_0 a_0 s_1 a_1 \cdots$ of alternating
states and actions, such that $s_i \in S$, $a_i \in A(s_i)$ and $p_{s_i, s_{i+1}}(a_i) > 0$ for all $i \geq 0$.
For $i \geq 0$, the sequence is constructed by iterating a two-phase selection process.
First, an action $a_i \in A(s_i)$ is selected nondeterministically; second, the successor
state $s_{i+1}$ is chosen according to the probability distribution $p_{s_i, s_{i+1}}(a_i)$. A timed
probabilistic system (TPS) $\Pi = (S, A, p, S_{in}, time)$ consists of an MDP $(S, A, p)$,
and of the following additional components [dA97a, dA98]:

— A subset $S_{in} \subseteq S$ of initial states. Each behavior of $\Pi$ must begin with a
state in $S_{in}$.

— A labeling $time$ that associates to each $s \in S$ and $a \in A(s)$ the expected
amount of time $time(s, a) \in \mathbb{R}^+$ spent at $s$ when action $a$ is selected.

We  will  often  associate  with  an  MDP  or  TPS  additional  labelings;  the  labelings 
will  be  simply  added  to  the  list  of  components.  We  define  the  size  of  an  MDP  or 
TPS $\Pi$ to be the length (in bits) of its encoding, where we assume that transition
probabilities  are  encoded  as  the  ratio  between  integers. 

To be able to assign probabilities to sets of behaviors, we need to specify
the criteria used to choose the actions. To this end, we use the concept of policy
[Der70], closely related to the adversaries of [SL94, Seg95] and to the schedulers of
[Var85, PZ86]. A policy $\eta$ is a set of conditional probabilities $Q_\eta(a \mid s_0 s_1 \cdots s_n)$,
for all sequences of states $s_0 s_1 \cdots s_n \in S^+$ and all $a \in A(s_n)$. The conditional
probability $Q_\eta(a \mid s_0 s_1 \cdots s_n)$ is the probability with which action $a \in A(s_n)$ is
chosen after the system has followed the sequence of states $s_0 s_1 \cdots s_n$. For all
sequences of states $s_0 s_1 \cdots s_n \in S^+$, it must be $\sum_{a \in A(s_n)} Q_\eta(a \mid s_0 s_1 \cdots s_n) = 1$.
Thus, a policy can be both history-dependent and randomized.
Under policy $\eta$, the probability of a transition from $s_n$ to $t$ after the state
sequence $s_0 \cdots s_n$ is given by $\sum_{a \in A(s_n)} p_{s_n, t}(a) \, Q_\eta(a \mid s_0 \cdots s_n)$. A policy
$\eta$ gives rise to a probability distribution over the set of behaviors [Der70]. We
write $\Pr^\eta_s(\mathcal{A})$ to denote the probability of event $\mathcal{A}$ when policy $\eta$ is used from
the initial state $s$. We also let $X_i$ and $Y_i$ be the random variables representing
the $i$-th state and the $i$-th action along a behavior, respectively. We say that a
policy $\eta$ is memoryless if $Q_\eta(a \mid s_0 s_1 \cdots s_n) = Q_\eta(a \mid s_n)$ for all sequences of
states $s_0 s_1 \cdots s_n \in S^+$ and all $a \in A(s_n)$.
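
For a memoryless policy, the transition probability formula above depends only on the current state, so the policy turns the MDP into a Markov chain. The Python sketch below computes the one-step probabilities of that chain; the dictionary encodings of `p` and `Q` and the function name `induced_chain` are assumptions chosen for this illustration.

```python
from typing import Dict, Tuple

# p[(s, a)] maps each successor t to p_st(a); Q[s] maps each action a to Q(a | s).
Prob = Dict[Tuple[str, str], Dict[str, float]]
Policy = Dict[str, Dict[str, float]]


def induced_chain(p: Prob, Q: Policy) -> Dict[str, Dict[str, float]]:
    """One-step probabilities sum_a p_st(a) * Q(a | s) under a memoryless policy."""
    chain: Dict[str, Dict[str, float]] = {}
    for s, choice in Q.items():
        assert abs(sum(choice.values()) - 1.0) < 1e-9, "Q(. | s) must sum to 1"
        row: Dict[str, float] = {}
        for a, q in choice.items():
            for t, pr in p[(s, a)].items():
                row[t] = row.get(t, 0.0) + q * pr
        chain[s] = row
    return chain


p = {("s", "a"): {"s": 0.5, "t": 0.5}, ("s", "b"): {"t": 1.0}, ("t", "a"): {"s": 1.0}}
Q = {"s": {"a": 0.5, "b": 0.5}, "t": {"a": 1.0}}
chain = induced_chain(p, Q)
# chain == {"s": {"s": 0.25, "t": 0.75}, "t": {"s": 1.0}}
```

A history-dependent policy would need `Q` to be indexed by the whole state sequence, which is why it does not in general yield a Markov chain over $S$.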

3.2  Probabilistic  Fairness 

Fairness  is  a  concept  that  has  been  introduced  in  the  context  of  non-probabilistic 
systems  to  model  the  outcome  of  probabilistic  choices  while  abstracting  from  the 
numerical  values  of  the  probabilities.  Notions  of  fairness  for  probabilistic  systems 
have  been  studied  in  [HSP83,  Var85]  and  more  recently  in  [KB96],  which  also 
present  model-checking  algorithms  for  probabilistic  systems  with  fairness. 

Given an MDP $\Pi = (S, A, p)$, a fairness condition $F$ for $\Pi$ is a mapping
that associates to each $s \in S$ a subset $F(s) \subseteq A(s)$. The intended meaning is
that the choice at $s$ among actions in $F(s)$ should be “fair.” The various notions
of fairness differ in the way in which this “fairness” is defined. According to
[KB96], a policy $\eta$ is said to be strictly fair (resp. almost, or weakly, fair) if
the behaviors that arise under $\eta$ all satisfy (resp. satisfy with probability 1)
the following condition: whenever a behavior visits infinitely often a state $s$,
each action in $F(s)$ is chosen infinitely often at $s$. In this paper we introduce
a new notion of fairness, called probabilistic fairness. Unlike the above notion
of fairness, the definition of probabilistic fairness refers directly to the policies,
rather than to the behaviors that arise from the policies.

Given an MDP $\Pi = (S, A, p)$ and a fairness condition $F$ for $\Pi$, we say
that a policy $\eta$ is (probabilistically) $F$-fair if there is $\varepsilon > 0$ such that, for all
$n \geq 0$, all sequences of states $s_0 \cdots s_n \in S^+$, and all $a \in F(s_n)$, we have
$Q_\eta(a \mid s_0 \cdots s_n) \geq \varepsilon$. The set of $F$-fair policies is denoted by $\eta(F)$.

Clearly, if a policy is $F$-fair then it is also weakly fair; the converse is not
true in general. In the above definition, $\varepsilon$ can depend on the policy $\eta$, but cannot
depend on the past sequence $s_0 \cdots s_{n-1}$ of states. If $\varepsilon$ could depend on the past,
then probabilistic fairness would not imply weak fairness. Later we will prove
that, for finite TPSs and in the context of the long-run average properties we
consider, probabilistic fairness is equivalent to weak fairness. This equivalence
does not hold for all types of systems and properties.

A fair TPS $\Pi = (S, A, p, S_{in}, time, F)$ consists of a TPS $(S, A, p, S_{in}, time)$
and of a fairness condition $F$ for $(S, A, p)$.
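
For a memoryless policy on a finite MDP, the definition of probabilistic fairness reduces to a finite check: the bound can be taken to be the smallest probability assigned to any fair action, and the policy is fair exactly when that minimum is positive. The Python sketch below performs this check; the function name `fairness_witness` and the dictionary encodings are assumptions of the sketch, and it does not cover history-dependent policies, for which a uniform bound over infinitely many histories is required.

```python
from typing import Dict, Optional, Set


def fairness_witness(Q: Dict[str, Dict[str, float]],
                     F: Dict[str, Set[str]]) -> Optional[float]:
    """Return the largest bound witnessing fairness of the memoryless policy Q
    with respect to the fairness condition F, or None if Q is not fair.
    When no action is fair, the condition holds vacuously and 1.0 is returned."""
    probs = [Q[s].get(a, 0.0) for s in F for a in F[s]]
    eps = min(probs, default=1.0)
    return eps if eps > 0 else None


Q = {"s": {"a": 0.25, "b": 0.75}, "t": {"a": 1.0}}
F = {"s": {"a", "b"}, "t": set()}
# fairness_witness(Q, F) == 0.25
```

By contrast, a history-dependent policy that selects a fair action with probability $2^{-n}$ after $n$ steps gives positive probability to that action at every step but admits no uniform bound, so it is not probabilistically fair.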

3.3  Translating  STS  into  Fair  TPS 

Given an STS $\mathcal{S} = (V, \Theta, T)$, its translation TPS $\Pi_{\mathcal{S}} = (S, A, p, S_{in}, time, F)$
shares the same state space $S$ of $\mathcal{S}$; the set of initial states is $S_{in} = \{s \in S \mid s \models \Theta\}$.
For each $s \in S$, the other components of $\Pi_{\mathcal{S}}$ are defined as follows,
depending on whether some immediate transition is enabled at $s$ or not.

3.3.1 Some immediate transition enabled. Let $T_i(s) = T(s) \cap T_i$ be the
set of immediate transitions enabled at $s$, and assume that $T_i(s) \neq \emptyset$. In this
case, we let $A(s) = \{a_\tau \mid \tau \in T_i(s)\}$, where action $a_\tau$ represents the choice of
transition $\tau$ at $s$. For all $\tau \in T_i(s)$, we let $time(s, a_\tau) = 0$; moreover, action $a_\tau$ is
fair at $s$ iff $\tau$ is fair: precisely, $a_\tau \in F(s)$ iff $\tau \in T_f$, for all $\tau \in T_i(s)$.

For each mode $1 \leq i \leq m_\tau$, action $a_\tau$ leads with probability $p^\tau_i$ to state $f^\tau_i(s)$,
except that if two or more modes lead to the same state, the probabilities are
added. Precisely, for all $t \in S$, we let $p_{st}(a_\tau) = \sum_{i=1}^{m_\tau} p^\tau_i \, \delta[f^\tau_i(s) = t]$, where $\delta[\alpha]$
is 1 if $\alpha$ is true, and 0 otherwise.

3.3.2 No immediate transitions enabled. If $T(s) \subseteq T_d$, we let $T_e(s) = T(s) \cap T_e$
and $T_u(s) = T(s) \cap T_u$; note that $T_e(s) \neq \emptyset$, due to the presence
of the idling transition. We let $A(s) = \{a_e\} \cup \{a_\tau \mid \tau \in T_u(s)\}$: action $a_e$
represents the choice of a transition with exponential distribution, and for
$\tau \in T_u(s)$ action $a_\tau$ represents the choice of the transition $\tau$, which has unspecified
delay distribution. We let $F(s) = A(s)$, and we define the expected times of the
actions by $time(s, a_e) = 1 / \sum_{\tau' \in T_e(s)} \gamma_{\tau'}$ and $time(s, a_\tau) = 0$ for all $\tau \in T_u(s)$.

Moreover, for $\tau \in T_e(s)$ let $p_s(\tau) = \gamma_\tau / \sum_{\tau' \in T_e(s)} \gamma_{\tau'}$. In other words, $p_s(\tau)$ is
the probability that $\tau$ is selected at $s$, conditional to the fact that the transition
is selected from $T_e(s)$. For all $t \in S$ and $\tau \in T_u(s)$, the transition probabilities
are defined by:

$$p_{st}(a_\tau) = \sum_{i=1}^{m_\tau} p^\tau_i \, \delta[f^\tau_i(s) = t] \qquad p_{st}(a_e) = \sum_{\tau \in T_e(s)} p_s(\tau) \sum_{i=1}^{m_\tau} p^\tau_i \, \delta[f^\tau_i(s) = t] .$$
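
The two cases of the translation can be sketched for a single state as follows. This is a minimal Python illustration under stated assumptions, not the construction used by any tool: the function names, the tuple encoding `(name, kind, rate, modes)` of an enabled transition, and the action name `"a_e"` are all choices made here, and each mode is given directly as a pair of its probability and its successor state.

```python
from collections import defaultdict


def successor_dist(modes):
    """Collapse modes (p_i, successor) into a distribution, adding the
    probabilities of modes that lead to the same state."""
    dist = defaultdict(float)
    for prob, t in modes:
        dist[t] += prob
    return dict(dist)


def translate_state(enabled, fair_names=frozenset()):
    """enabled: list of (name, kind, rate, modes) for the transitions in T(s),
    with kind in {"immediate", "exponential", "unspecified"}.
    Returns (p, time, fair): p[a] is the successor distribution of action a,
    time[a] its expected time, and fair the set F(s)."""
    p, time, fair = {}, {}, set()
    immediate = [e for e in enabled if e[1] == "immediate"]
    if immediate:                                   # some immediate transition enabled
        for name, _, _, modes in immediate:
            p[name] = successor_dist(modes)
            time[name] = 0.0
            if name in fair_names:                  # action is fair iff the transition is
                fair.add(name)
        return p, time, fair
    # Only delayed transitions; the idle transition keeps the exponential set non-empty.
    expo = [e for e in enabled if e[1] == "exponential"]
    total = sum(rate for _, _, rate, _ in expo)
    a_e = defaultdict(float)
    for _, _, rate, modes in expo:
        for t, q in successor_dist(modes).items():
            a_e[t] += (rate / total) * q            # p_s(tau) times mode probability
    p["a_e"], time["a_e"] = dict(a_e), 1.0 / total
    for name, kind, _, modes in enabled:
        if kind == "unspecified":
            p[name], time[name] = successor_dist(modes), 0.0
    return p, time, set(p)                          # F(s) = A(s)
```

For example, with an idle transition of rate 1 looping on `"s"`, an exponential transition of rate 3 leading to `"t"`, and an unspecified-delay transition leading to `"r"`, the action `"a_e"` has expected time 0.25 and distribution `{"s": 0.25, "t": 0.75}`, while the unspecified-delay action has expected time 0.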

3.4  Non-Zeno  TPSs 

We say that a fair TPS is non-Zeno if time diverges with probability 1 along
all behaviors, under all fair policies. Precisely, $\Pi = (S, A, p, S_{in}, time, F)$ is non-Zeno
iff we have $\Pr^\eta_s\bigl(\sum_{k=0}^{\infty} time(X_k, Y_k) = \infty\bigr) = 1$ for all $s \in S_{in}$ and all
$\eta \in \eta(F)$. Since behaviors along which time does not diverge have no physical
meaning,  we  only  consider  non-Zeno  TPSs:  after  translating  an  STS  into  a  fair 
TPS,  it  is  necessary  to  check  that  it  is  non-Zeno.  A  method  to  do  this  is  presented 
in  Section  6.  A  more  sophisticated  approach  to  the  problem  of  time  divergence, 
inspired  by  [Seg95],  is  discussed  in  [dA97a]. 

4  Translation  and  Informal  Semantics 

Even  though  the  formal  semantics  of  STSs  is  defined  by  translation  into  fair 
TPSs, there is a correspondence between the proposed translation and the informal
semantics presented in Section 2.1. This correspondence is important from
a  pragmatic  point  of  view,  since  system  models  are  usually  constructed  with  this 
intuitive  semantics  in  mind.  We  justify  the  translation  in  three  steps,  considering 
first  the  structure  of  the  translation  TPS,  then  the  use  of  fairness,  and  lastly  the 
interaction  between  translation  and  specification  languages. 

4.1 Structure of the Translation TPS


To  understand  the  correspondence  between  the  translation  and  the  informal 
semantics,  consider  the  system  evolution  from  a  state  s.  If  there  are  immediate 
transitions  enabled  at  s,  the  correspondence  between  the  informal  semantics  and 
the  structure  of  the  translation  TPS  is  immediate. 

If $T(s) \subseteq T_d$, let as before $T_e(s) = T(s) \cap T_e$ and $T_u(s) = T(s) \cap T_u$. The
set of available actions at $s$ is $\{a_e\} \cup \{a_\tau \mid \tau \in T_u(s)\}$. Let $q_e$ and $q_\tau$, for
$\tau \in T_u(s)$, be the probabilities with which these actions are chosen by a policy.
Note that $q_e$ and $q_\tau$ can depend on the past history of the behavior. There is
a relation between the probabilities $q_e$ and $q_\tau$, $\tau \in T_u(s)$, selected by the policy,
and the rates $\gamma_\tau$ of the transitions in $T_u(s)$, selected nondeterministically in the
informal semantics. To derive the relation, consider the probability of choosing
$\tau \in T(s)$ in the translation TPS and in the informal semantics. In the TPS, this
probability is equal to $q_\tau$ for $\tau \in T_u(s)$, and to $q_e \, p_s(\tau)$ for $\tau \in T_e(s)$. In the
informal semantics this probability is equal to $\gamma_\tau / \sum_{\tau' \in T(s)} \gamma_{\tau'}$ for all $\tau \in T(s)$.
Equating these probabilities, we obtain

$$q_e = \Bigl(\sum_{\tau' \in T_e(s)} \gamma_{\tau'}\Bigr) \Big/ \Bigl(\sum_{\tau' \in T(s)} \gamma_{\tau'}\Bigr) \qquad q_\tau = \gamma_\tau \Big/ \Bigl(\sum_{\tau' \in T(s)} \gamma_{\tau'}\Bigr) \qquad (1)$$

for all $\tau \in T_u(s)$. This relation between $q_e$, $\{q_\tau\}_{\tau \in T_u(s)}$ and $\{\gamma_\tau\}_{\tau \in T_u(s)}$ preserves
not only the probabilities of the transitions from $s$, but also the expected time
spent at $s$. In fact, from Section 3.3.2 the expected time spent by the TPS
at $s$ is equal to $q_e / \sum_{\tau' \in T_e(s)} \gamma_{\tau'}$. If we substitute into this equation the value
of $q_e$ given by (1), we obtain $1 / \sum_{\tau' \in T(s)} \gamma_{\tau'}$, which is exactly the expected
time spent at $s$ under the informal semantics. Thus, equations (1) together with
the constraint $q_e + \sum_{\tau \in T_u(s)} q_\tau = 1$ define a one-to-one mapping between the
unspecified transition rates in the informal semantics and the probabilities of
choosing the actions in the translation TPS. The mapping preserves both the
expected time spent at $s$, and the probabilities of transitions from $s$. Given a
nondeterministic choice for the transition rates $\{\gamma_\tau\}_{\tau \in T_u(s)}$, we can determine a
policy which simulates this choice; conversely, each policy can be interpreted as
a choice for these rates. This correspondence indicates that the translation from
STSs to fair TPSs preserves the informal semantics of STSs.
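
The mapping between rates and action probabilities can be checked numerically. The Python sketch below implements relation (1) in both directions and the expected time spent at $s$ in the TPS; the function names and the dictionary encoding of rates are assumptions of this illustration, and the inverse direction assumes $q_e > 0$.

```python
def rates_to_policy(exp_rates, unspec_rates):
    """Relation (1): from rates to the probabilities q_e and {q_tau}."""
    exp_total = sum(exp_rates.values())
    total = exp_total + sum(unspec_rates.values())
    return exp_total / total, {n: g / total for n, g in unspec_rates.items()}


def policy_to_rates(exp_rates, q_e, q):
    """Inverse direction: recover the unspecified rates (requires q_e > 0)."""
    total = sum(exp_rates.values()) / q_e        # sum of all rates in T(s)
    return {n: q_tau * total for n, q_tau in q.items()}


def expected_time_tps(exp_rates, q_e):
    """q_e * time(s, a_e); the unspecified-delay actions contribute time 0."""
    return q_e / sum(exp_rates.values())


exp_rates = {"idle": 1.0, "go": 1.0}
q_e, q = rates_to_policy(exp_rates, {"u": 2.0})
# q_e == 0.5, q == {"u": 0.5}
# expected_time_tps(exp_rates, q_e) == 0.25, i.e. 1 / (1 + 1 + 2)
# policy_to_rates(exp_rates, q_e, q) == {"u": 2.0}
```

The round trip recovers the chosen rate, which illustrates the one-to-one nature of the mapping on this instance.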


4.2  Translation  and  Fairness 

The above considerations also justify our use of fairness in the translation. In fact,
for $\tau \in T_u(s)$ the fairness of $a_\tau$ requires that $q_\tau > 0$, which by (1) corresponds to
the requirement $\gamma_\tau > 0$. Similarly, the fairness of $a_e$ requires that $q_e > 0$, which
corresponds to the requirement $\gamma_\tau < \infty$ for all $\tau \in T_u(s)$. Thus, the fairness
conditions and the notion of probabilistic fairness are the exact counterpart of
the requirements $0 < \gamma_\tau < \infty$ for the rates of transitions $\tau \in T_u$.

4.3 Translation and Specification Language

In  Section  3.3.2  we  assign  expected  time  0  to  the  actions  that  correspond  to 
transitions  with  unspecified  rates.  The  argument  presented  above  to  justify  this 
assignment  is  valid  only  if  we  assume  a  restriction  on  the  expressive  power  of 
specification  methods.  Precisely,  we  allow  specification  methods  to  refer  to  the 
amount  of  time  spent  at  a  state,  but  we  require  that  they  do  not  measure  this 
amount  of  time  conditional  on  the  successor  state. 

To clarify this point, consider as an example a state $s$ of an STS on which
two transitions are enabled: a transition $\tau_1$ with rate $\gamma_1$, leading to state $t_1$,
and a transition $\tau_2$ with unspecified rate, leading to state $t_2$. The translation
we proposed would be inappropriate if our specification methods could express
properties like: “the time spent at $s$ when $t_1$ is the immediate successor is on
average $> b$.” In fact, for the purposes of this property the choice of assigning
$time(s, a_{\tau_2}) = 0$ would not correspond to the idea of assigning nondeterministically
a transition rate to $\tau_2$. On the other hand, if the specification methods can
refer  only  to  the  expected  time  spent  at  s,  regardless  of  the  successor  of  s,  then 
the  translation  is  faithful  to  the  informal  semantics.  The  specification  methods 
discussed  in  the  next  section  obey  this  restriction. 

5  Specification  of  Long-Run  Average  Properties 

The long-run average properties we consider in this paper refer to the average
outcome of a task, measured over an interval of time whose length diverges
to infinity. A task is a (hopefully) finite activity performed regularly by the
system. The outcome of the task can depend both on its completion and on its
duration. For example, a task might consist of sending a message and waiting for
the acknowledgment; its outcome might be 1 if the acknowledgment is received, or 0 if a
timeout occurs. The long-run average outcome of this task is equal to the long-run
average fraction of messages that are acknowledged. In [dA97a], tasks were
specified using labeled graphs called experiments. Here, we follow a simplified
approach, and given a fair TPS $\Pi = (S, A, p, S_{in}, time, \mathcal{F})$, we specify tasks and
their outcomes using two labelings $w$ and $r$:

—  The labeling $w : S \times S \to \{0, 1\}$ associates to each $s, t \in S$ a label $w(s, t)$,
which has value 1 if the transition $s \to t$ completes a task, and 0 otherwise.

—  The labeling $r : S \times A \times S \to \mathbb{R}^+$ is used to define the outcome
of a task. Due to the restrictions on specification languages mentioned in
Section 4.3, we consider only labelings that can be written in the form

$$r(s, a, t) = \alpha(s) \, time(s, a) + \beta(s, t)$$

for some functions $\alpha : S \to \mathbb{R}^+$ and $\beta : S \times S \to \mathbb{R}^+$ (where $\mathbb{R}^+ = \{x \in
\mathbb{R} \mid x \ge 0\}$). Thus, the labeling $r$ can be used to measure the expected time
spent at system states, weighted by a function $\alpha$; the “cost” associated to
system transitions, expressed by $\beta$; or a combination of the two.
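
As an illustration of the restricted form of the labeling $r$, the following Python sketch builds $r$ from a state weight, a time function and a transition cost. The helper name `make_r` and the two-state example are assumptions introduced here for illustration; they are not part of the formalism above.

```python
def make_r(alpha, time, beta):
    """Build r(s, a, t) = alpha(s) * time(s, a) + beta(s, t)."""
    def r(s, a, t):
        return alpha(s) * time(s, a) + beta(s, t)
    return r


# Hypothetical example: measure only the expected time spent in state "busy".
EXPECTED_TIME = {("busy", "work"): 2.5, ("idle", "wait"): 1.0}

def time(s, a):
    return EXPECTED_TIME[(s, a)]

def alpha(s):
    return 1.0 if s == "busy" else 0.0   # weight of the time spent at s

def beta(s, t):
    return 0.0                           # no cost attached to transitions

r = make_r(alpha, time, beta)
```

Because $r$ depends on the successor only through $\beta(s, t)$, a labeling of this form cannot measure the time spent at $s$ conditional on the successor state, in line with the restriction of Section 4.3.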

In GSPN reward models [CMT91] it is possible to associate a reward rate to the
places and transitions of the net; [Cla96] and [Ber97] propose methods for associating
a reward with each state of the Markov chain generated from a PEPA or
EMPA model. The $r$ labeling discussed above serves a similar purpose; however,
we also introduce the notion of task, and the corresponding $w$ labeling. For systems
that can be translated into ergodic Markov chains, the two approaches are
equally expressive: even without a $w$ labeling, the average outcome of a task can
be measured by measuring separately the rates of task completion and of outcome
generation, and by taking the ratio between the two. In the case of systems with
nondeterministic behavior, however, our approach leads to more expressive specification
methods. In fact, in these systems the choice of policy may influence
the task completion rate and the outcome generation rate differently. Thus, the
ratio between the maximal outcome generation rate and the minimal task completion
rate is in general not equal to the maximal long-run average outcome of
a task.

From the $r$, $w$ labelings, for each behavior $\omega$ of $\Pi$ we define a predicate
$I(\omega)$ and a quantity $H_n(\omega)$, for $n > 0$, as follows:

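$$I(\omega) \;\text{ iff }\; \exists^\infty k \,.\, [\, r(X_k, Y_k, X_{k+1}) > 0 \,\vee\, w(X_k, X_{k+1}) > 0 \,] \qquad (2)$$

$$H_n(\omega) = \frac{\sum_{k=0}^{n-1} r(X_k, Y_k, X_{k+1})}{\sum_{k=0}^{n-1} w(X_k, X_{k+1})} \qquad (3)$$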

In (2), the notation $\exists^\infty k$ is an abbreviation for “there are infinitely many distinct
values for $k$”. Thus, $I$ holds if $\omega$ completes infinitely many tasks, or if one
such task produces infinite outcome. The quantity $H_n$ represents the average
outcome per task for the first $n$ steps of $\omega$. For all $s \in S$ and all policies $\eta$, we
let

$$H^-_\eta(s) = \inf \{ a \in \mathbb{R} \mid \Pr_s^\eta (I \wedge \liminf_{n \to \infty} H_n < a) > 0 \} \qquad (4)$$

be the infimum of the set of long-run average outcomes obtained with non-zero
probability by behaviors that satisfy $I$. We do not consider behaviors on which
$I$ is false, since these behaviors after a certain position cease to complete tasks
or to produce outcome, and the long-run average outcome is consequently not
well-defined: this point is discussed in detail in [dA97a, dA98]. Finally, we let


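$$H^-_{\mathcal{F}}(s) = \inf_{\eta \in \eta(\mathcal{F})} H^-_\eta(s) \qquad\qquad H^+_{\mathcal{F}}(s) = \sup_{\eta \in \eta(\mathcal{F})} H^+_\eta(s) \, .$$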


The quantities $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$ represent the minimal and maximal long-run
average outcomes that can be achieved with non-zero probability by $I$-behaviors,
provided that the long-run average outcome is well-defined, and that an $\mathcal{F}$-fair
policy is used from $s$. The specification of long-run average properties of STSs
and fair STSs is based on the specification of lower (resp. upper) bounds for
$H^-_{\mathcal{F}}(s)$ (resp. $H^+_{\mathcal{F}}(s)$), for some states $s \in S$.
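
The average outcome per task over a finite behavior prefix, as described for (3), can be sketched in Python as follows. The function name `average_outcome` and the message-sending example (states `send`, `ok`, `fail`) are assumptions chosen for illustration; the sketch evaluates a single finite prefix and does not compute the long-run bounds themselves.

```python
def average_outcome(prefix, r, w):
    """Return H_n for a behavior prefix given as n triples (X_k, Y_k, X_{k+1}).

    Returns None when the prefix completes no task, since the ratio of
    accumulated outcome to completed tasks is then undefined.
    """
    total_r = sum(r(s, a, t) for s, a, t in prefix)
    total_w = sum(w(s, t) for s, _, t in prefix)
    return total_r / total_w if total_w > 0 else None


# Hypothetical example: a task ends on every return to "send"; its outcome
# is 1 when the message was acknowledged (successor "ok"), and 0 otherwise.
def r(s, a, t):
    return 1.0 if (s, t) == ("send", "ok") else 0.0

def w(s, t):
    return 1 if t == "send" else 0

prefix = [("send", "tx", "ok"), ("ok", "next", "send"),
          ("send", "tx", "fail"), ("fail", "next", "send"),
          ("send", "tx", "ok"), ("ok", "next", "send")]
h = average_outcome(prefix, r, w)  # 2 acknowledged messages over 3 tasks
```

In this example the prefix completes three tasks, two of which have outcome 1, so the value is the fraction of acknowledged messages, matching the informal reading of the long-run average outcome given at the start of this section.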

As an example, consider the commuter system of Section 2.2. For all $s, t \in S$,
we let $w(s, t) = 1$ if $s[c] = 4$ and $t[c] = 1$, and $w(s, t) = 0$ otherwise, so that $w$
counts the number of returns to city A. For all $s \in S$ and $a \in A(s)$, we also let
$r(s, a) = time(s, a)$ if $s[c] \in \{1, 2\}$, and $r(s, a) = 0$ otherwise, so that $r$ measures
the time spent going from A to B. With these labelings, $H^+_{\mathcal{F}}(s)$ is equal to the
maximal long-run average duration of a one-way trip from city A to city B, if
the system is initially at $s$ (it can be shown that $H^+_{\mathcal{F}}(s)$ does not depend on $s$ in
this system). Using the algorithm presented in Section 6, we can compute that
$H^+_{\mathcal{F}}(s) \approx 7.5526$.

6  Verification  of  Long-Run  Average  Properties 

The verification problem for long-run average properties consists in computing
$H^-_{\mathcal{F}}(s)$, $H^+_{\mathcal{F}}(s)$ at all states $s \in S$ of a fair TPS. Algorithms that solve this verification
problem for the case without fairness conditions have been presented
in [dA97a, dA98]. To solve the model-checking problem in the presence of fairness
conditions, we first decompose the fair TPS into the components where a behavior
can reside forever under a fair policy. These components are called fair end
components, and are presented below. Once the TPS has been decomposed, we
apply to each component the algorithm of [dA98] to compute the maximal and
minimal long-run average outcome for the component, disregarding the fairness
conditions. These maximal and minimal values correspond to optimal and pessimal
policies, which need not be fair. Nevertheless, using results on parametric
Markov chains we show that we can approximate these policies with a series of
fair policies, whose long-run average outcome converges to that of the optimal
and pessimal policies. This shows that, for each component, the maximal and
minimal long-run average outcomes computed disregarding fairness conditions
also apply to the case with fairness. Hence, the values of $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$ at
a state $s$ can be obtained by taking, respectively, the minimum and maximum of the
long-run average outcomes computed for the components reachable from $s$.

6.1  Fair  End  Components 

Given an MDP $\Pi = (S, A, p)$, a sub-MDP is a pair $(C, D)$, where $C \subseteq S$ and
$D$ is a function that associates to each $s \in C$ a subset $D(s) \subseteq A(s)$ of actions.
The sub-MDP thus corresponds to a subset of states and actions of the original
MDP. We say that a sub-MDP $(C, D)$ is contained in a sub-MDP $(C', D')$ if
$\{(s, a) \mid s \in C \wedge a \in D(s)\} \subseteq \{(s, a) \mid s \in C' \wedge a \in D'(s)\}$.

Given a fairness condition $\mathcal{F}$ for $\Pi$, we say that a sub-MDP $(C, D)$ is a fair
end component (FEC) if the following conditions hold [dA97a]:

—  Closure: for all $s \in C$, $a \in D(s)$, and $t \in S$, if $p_{st}(a) > 0$ then $t \in C$.

—  Connectivity: Let $E = \{(s, t) \in C \times C \mid \exists a \in D(s) \,.\, p_{st}(a) > 0\}$. The graph
$(C, E)$ is strongly connected.

—  Fairness: For all $s \in C$, we have $\mathcal{F}(s) \subseteq D(s)$.

We say that a FEC $(C, D)$ is maximal if there is no other FEC $(C', D')$ that
properly contains $(C, D)$. We denote by $\mathrm{MFEC}(\Pi, \mathcal{F})$ the set of maximal FECs
of $\Pi$. The set $\mathrm{MFEC}(\Pi, \mathcal{F})$ can be computed in time polynomial in $\sum_{s \in S} |A(s)|$
using simple graph algorithms; an algorithm to do so is given in [dA97a, §8].
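
The three FEC conditions suggest a simple refinement procedure, sketched below in Python. This is an illustrative sketch written for this section and is not claimed to be the algorithm of [dA97a, §8]: it repeatedly prunes actions that violate closure, drops states that can no longer retain all their fair actions, and splits the remainder into strongly connected components. The representation (`succ[s][a]` as the set of states $t$ with $p_{st}(a) > 0$, `fair[s]` as $\mathcal{F}(s)$) and the names `sccs` and `maximal_fecs` are assumptions.

```python
def sccs(nodes, edges):
    """Strongly connected components of a graph (iterative Kosaraju)."""
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(edges[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(edges[nxt])))
                    break
            else:  # all successors explored: node is finished
                order.append(node)
                stack.pop()
    reverse = {n: set() for n in nodes}
    for n in nodes:
        for m in edges[n]:
            reverse[m].add(n)
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        component, todo = set(), [root]
        assigned.add(root)
        while todo:
            n = todo.pop()
            component.add(n)
            for m in reverse[n] - assigned:
                assigned.add(m)
                todo.append(m)
        components.append(component)
    return components


def maximal_fecs(succ, fair):
    """Return a list of dicts D, one per maximal FEC, mapping each state
    of the component to the set of actions retained at that state."""
    result = []
    work = [{s: set(acts) for s, acts in succ.items()}]
    while work:
        D = work.pop()
        changed = True
        while changed:
            changed = False
            for s in list(D):
                # Closure: keep only actions whose successors stay inside.
                keep = {a for a in D[s] if succ[s][a].issubset(D)}
                # Fairness: every action of F(s) must survive, else drop s.
                if not keep or not fair[s] <= keep:
                    del D[s]
                    changed = True
                elif keep != D[s]:
                    D[s] = keep
                    changed = True
        if not D:
            continue
        edges = {s: set().union(*(succ[s][a] for a in D[s])) for s in D}
        components = sccs(list(D), edges)
        if len(components) == 1:
            result.append(D)  # closed, strongly connected, and fair
        else:
            work.extend({s: set(D[s]) for s in c} for c in components)
    return result
```

For instance, with `succ = {0: {"a": {1}, "b": {2}}, 1: {"c": {0}}, 2: {"d": {2}}}` and an empty fairness condition, the sketch returns the components on states {0, 1} and {2}. If instead action `b` is required to be fair at state 0, the component on {0, 1} disappears, because retaining `b` would violate closure.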

Intuitively, a fair end component is a portion of an MDP consisting of the states
and actions that can be visited infinitely often by a behavior with positive probability,
under some fair policy. To make this concept precise, given a behavior $\omega$
we let $(C, D) = \mathrm{inft}(\omega)$ be the sub-MDP defined by $C = \{s \mid \exists^\infty k \,.\, X_k = s\}$ and,
for $s \in C$, $D(s) = \{a \mid \exists^\infty k \,.\, X_k = s \wedge Y_k = a\}$.

Theorem 1 For any $s \in S$ and $\eta \in \eta(\mathcal{F})$, we have $\Pr_s^\eta(\mathrm{inft}(\omega) \text{ is a FEC}) = 1$.

In  a  purely  probabilistic  system,  fair  end  components  correspond  to  the  closed 
recurrent  classes  of  the  Markov  chain  underlying  the  system  [KSK66].  Fair  end 
components  are  the  fair  counterpart  of  the  end  components  of  [dA97a,  dA98], 
and  are  related  to  sets  used  in  [KB96]  to  solve  the  model-checking  problem  for 
PBTL*.  As  our  first  application  of  the  above  theorem,  we  obtain  a  criterion  to 
decide  whether  a  fair  TPS  is  non-Zeno. 

Theorem 2 (condition for non-Zenoness) Given a fair TPS $\Pi =
(S, A, p, S_{in}, time, \mathcal{F})$, a FEC $(C, D)$ is a zero-FEC if $time(s, a) = 0$ for all $s \in C$
and $a \in D(s)$. TPS $\Pi$ is non-Zeno iff there is no zero-FEC reachable from $S_{in}$.

Even though there can be exponentially many zero-FECs in a fair TPS, it is
easy to see that it suffices to consider the maximal ones. Hence, checking non-Zenoness
can be done in time polynomial in $\sum_{s \in S} |A(s)|$ [dA97a, §8].

6.2  Parametric  Markov  Chains 

Given a finite set $S$ of indices, a substochastic matrix is a matrix $P = [p_{st}]_{s,t \in S}$
such that $0 \le p_{st} \le 1$ for all $s, t \in S$ and $\sum_{t \in S} p_{st} \le 1$ for all $s \in S$. Given
a substochastic matrix $P$, the steady-state distribution matrix is defined by
$P^* = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} P^k$ [KSK66]. We say that a state of $P$ is surely recurrent
if the Markov chain corresponding to $P$ has only one closed recurrent class, and
if the state belongs to that class. The following result can be proved by linear
algebra arguments [dA97a, §8].

Theorem 3 (continuity of steady-state distributions) Consider a family
$P(x) = [p_{st}(x)]_{s,t \in S}$ of substochastic matrices parameterized by $x \in I$, where
$I \subseteq \mathbb{R}$ is an interval of real numbers. Assume that the coefficients of $P(x)$
depend continuously on $x$ for $x \in I$. If there is $s \in S$ that is surely recurrent for
all $x \in I$, then also the coefficients of $P^*(x)$ depend continuously on $x$ for $x \in I$.
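
The role of the sure-recurrence hypothesis in Theorem 3 can be illustrated numerically. The Python sketch below assumes the standard Cesàro-average definition of the steady-state matrix and approximates it by a finite average; the two parametric families `good` and `bad` and the function name `cesaro_limit` are examples invented for this illustration.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_limit(P, steps=2000):
    """Approximate P* by the finite average (1/steps) * sum_{k<steps} P^k."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, P)
    return [[v / steps for v in row] for row in total]


def good(x):
    # State 1 lies in the unique closed recurrent class for every x in [0, 1].
    return [[0.5, 0.5], [x, 1.0 - x]]


def bad(x):
    # At x = 0 there are two closed recurrent classes: no state is surely recurrent.
    return [[1.0 - x, x], [x, 1.0 - x]]


good_at_0 = cesaro_limit(good(0.0))[0][1]     # close to 1
good_near_0 = cesaro_limit(good(0.01))[0][1]  # close to 1 / 1.02
bad_at_0 = cesaro_limit(bad(0.0))[0][0]       # equal to 1
bad_near_0 = cesaro_limit(bad(0.01))[0][0]    # close to 1 / 2
```

For the family `good` the approximated steady-state entries vary smoothly as $x \to 0$, whereas for `bad` the entry jumps from about 1/2 to 1 at $x = 0$, where the chain splits into two closed recurrent classes.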

6.3  The  Model-Checking  Algorithm

From the definitions of $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$, we see that these quantities depend
only on the states and actions that are repeated infinitely often. Theorem 1
states that these states and actions form a FEC with probability 1: hence, we
can concentrate our attention on the maximal FECs. We say that an MDP is
strongly connected if, for each pair of states, there is a behavior prefix that leads
from one state to the other. By definition, maximal FECs are strongly connected
sub-MDPs. Denote by $\emptyset = \lambda s . \emptyset$ the empty fairness condition. The following
theorem summarizes several results of [dA97a, §5] for strongly connected MDPs
without fairness conditions.


Theorem 4 Consider a strongly connected TPS $\Pi = (S, A, p, r, w)$. The following
assertions hold.

—  The value of $H^-_\emptyset(s)$ does not depend on $s \in S$. The common value $H^-_\emptyset$ can
be computed in time polynomial in the size of $\Pi$.

—  There is a memoryless policy $\eta^*$ such that $H^-_{\eta^*}(s) = H^-_\emptyset$ for all $s \in S$.
Moreover, the transition matrix $P_{\eta^*} = [p^*_{st}]_{s,t \in S}$ defined by
$p^*_{st} = \sum_{a \in A(s)} Q_{\eta^*}(a \mid s) \, p_{st}(a)$ corresponds to a Markov chain having a single
closed recurrent class.

Similar assertions hold for $H^+_\emptyset(s)$.


Using the results of the above theorem, we propose the following algorithm
for the computation of $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$.

Algorithm 1 (computation of $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$) Given a fair TPS $\Pi =
(S, A, p, S_{in}, time, \mathcal{F})$ together with labelings $r$, $w$, the quantities $H^-_{\mathcal{F}}(s)$ and
$H^+_{\mathcal{F}}(s)$ can be computed at all $s \in S$ as follows.

1.  Let $\mathcal{L} = \{(C, D) \in \mathrm{MFEC}(\Pi, \mathcal{F}) \mid \exists s, t \in C \,.\, \exists a \in D(s) \,.\, [r(s, a, t) >
0 \vee w(s, t) > 0]\}$ be the set of maximal FECs that contain at least one instance
of a strictly positive $r$ or $w$ label. Write $\mathcal{L} = \{(C_1, D_1), \ldots, (C_n, D_n)\}$.

2.  For each $1 \le i \le n$, let $\Pi_i = (C_i, D_i, p_i, \mathcal{F}_i, r_i, w_i)$, where $p_i$, $\mathcal{F}_i$, $r_i$, $w_i$ are
the restrictions of $p$, $\mathcal{F}$, $r$, $w$ to $C_i$, $D_i$. Using Theorem 4,
compute the values $H^-_{\Pi_i}$, $H^+_{\Pi_i}$ for all MDPs $\Pi_i$, for $1 \le i \le n$.

3.  For each $s \in S$, let $K(s) = \{i \in [1..n] \mid s \text{ can reach } C_i\}$ be the set of
indices of maximal FECs reachable from $s$. Then, $H^-_{\mathcal{F}}(s) = \min_{i \in K(s)} H^-_{\Pi_i}$
and $H^+_{\mathcal{F}}(s) = \max_{i \in K(s)} H^+_{\Pi_i}$.
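
The final combination step of the algorithm can be sketched in Python as follows, assuming the per-component values have already been computed. The sketch takes the minimum for the lower value and the maximum for the upper value over the reachable components, as in the summary at the start of Section 6. The names `reachable` and `combine`, the data layout (`succ[s][a]` as the set of states $t$ with $p_{st}(a) > 0$) and the numeric component values are assumptions made for illustration.

```python
from collections import deque


def reachable(succ, start):
    """States reachable from `start` through transitions of positive probability."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        for targets in succ[s].values():
            for t in targets - seen:
                seen.add(t)
                queue.append(t)
    return seen


def combine(succ, components, h_minus, h_plus):
    """components[i] is the state set C_i; h_minus[i] and h_plus[i] are the
    minimal and maximal long-run average outcomes computed for component i."""
    values = {}
    for s in succ:
        reach = reachable(succ, s)
        K = [i for i, C in enumerate(components) if reach & C]
        if K:  # states that reach no selected component are left out
            values[s] = (min(h_minus[i] for i in K), max(h_plus[i] for i in K))
    return values


# Hypothetical example: state 0 can reach both components {1} and {2}.
succ = {0: {"a": {1}, "b": {2}}, 1: {"c": {1}}, 2: {"d": {2}}}
values = combine(succ, [{1}, {2}], h_minus=[2.0, 5.0], h_plus=[3.0, 7.0])
```

Here state 0 obtains the pair (2.0, 7.0), since it can reach both components, while states 1 and 2 retain the values of their own component.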

Theorem 5 Algorithm 1 correctly computes $H^-_{\mathcal{F}}(s)$ and $H^+_{\mathcal{F}}(s)$.


Proof (sketch). The crux of the argument is to show that in a strongly connected
MDP the equality $H^-_{\mathcal{F}}(s) = H^-_\emptyset$ holds for all $s$ (and similarly for $H^+_{\mathcal{F}}(s)$).
Once this is done, the decomposition in maximal FECs (Step 1) is justified by
Theorem 1, and the selection of the maximal FECs that contain at least one
positive $r$, $w$ label is justified by (2), (3) and (4). Finally, Step 3 can be justified
using simple reachability arguments.

To show that in a strongly connected MDP $\Pi = (S, A, p, \mathcal{F}, r, w)$ we have
$H^-_{\mathcal{F}}(s) = H^-_\emptyset$ for all $s$, it suffices to show that $H^-_{\mathcal{F}}(s) = H^-_{\eta^*}(s)$, where $\eta^*$ is
the policy described in Theorem 4. To this end, let $\bar\eta$ be the memoryless $\mathcal{F}$-fair
policy that at each $s \in S$ chooses uniformly at random an action from $A(s)$. For
each $0 \le x \le 1$, define the memoryless policy $\eta(x)$ by

$$Q_{\eta(x)}(a \mid s) = x \, Q_{\bar\eta}(a \mid s) + (1 - x) \, Q_{\eta^*}(a \mid s)$$

for all $s$ and all $a \in A(s)$. Note that policy $\eta(x)$ is $\mathcal{F}$-fair for $0 < x \le 1$, and
it coincides with $\eta^*$ for $x = 0$. Let $P(x) = [p_{st}(x)]_{s,t \in S}$ be the matrix of the
Markov chain arising from $\eta(x)$, defined by

$$p_{st}(x) = \sum_{a \in A(s)} Q_{\eta(x)}(a \mid s) \, p_{st}(a) \, ,$$

and let

$$r_s(x) = \sum_{a \in A(s)} \sum_{t \in S} Q_{\eta(x)}(a \mid s) \, p_{st}(a) \, r(s, a, t) \qquad\qquad w_s(x) = \sum_{t \in S} p_{st}(x) \, w(s, t) \, ,$$

for all $s \in S$ and $0 \le x \le 1$. Denote by $P^*(x) = [p^*_{st}(x)]_{s,t \in S}$ the steady-state
distribution matrix corresponding to $P(x)$. By our choice of $\eta^*$ (see Theorem 4),
the Markov chain corresponding to $P(0)$ has a single closed recurrent class $C \subseteq S$.
Since the MDP is strongly connected, by definition of $\eta(x)$ all states of $C$
are surely recurrent for $0 \le x \le 1$. Hence, as a consequence of standard facts
on Markov chains we have

$$H^-_{\eta(x)}(s) = \frac{\sum_{t \in S} p^*_{st}(x) \, r_t(x)}{\sum_{t \in S} p^*_{st}(x) \, w_t(x)} \, .$$

Theorem 3 ensures that $\lim_{x \to 0} P^*(x) = P^*(0)$. Since for all $s \in S$ the quantities $r_s(x)$ and
$w_s(x)$ are also continuous for $x \to 0$, we have $\lim_{x \to 0} H^-_{\eta(x)}(s) = H^-_\emptyset$. From $H^-_\emptyset \le
H^-_{\mathcal{F}}(s)$ and from the fact that $\eta(x)$ is $\mathcal{F}$-fair follows $H^-_{\mathcal{F}}(s) = \inf_{\eta \in \eta(\mathcal{F})} H^-_\eta(s) =
H^-_\emptyset$, as was to be proved.
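
The approximation argument of the proof can be illustrated on a small example. The Python sketch below mixes the uniformly random policy with a minimizing memoryless policy and evaluates the ratio of accumulated expected outcome to accumulated expected task completions over a long horizon, which is used here as a numerical stand-in for the long-run average outcome. The two-state MDP, the policy `ETA_STAR` and all function names are assumptions invented for this illustration; transitions are taken to be deterministic to keep the sketch short.

```python
# Hypothetical MDP: MDP[s][a] = (successor, r label, w label).
MDP = {0: {"stay": (0, 1.0, 1), "go": (1, 0.0, 0)},
       1: {"back": (0, 5.0, 1)}}

# Memoryless minimizing policy; it never plays "go", so it is not fair
# whenever "go" belongs to the fairness condition at state 0.
ETA_STAR = {0: {"stay": 1.0}, 1: {"back": 1.0}}


def mixed_policy(x):
    """Q_{eta(x)}(a|s) = x * uniform(a|s) + (1 - x) * Q_{eta*}(a|s)."""
    return {s: {a: x / len(acts) + (1.0 - x) * ETA_STAR[s].get(a, 0.0)
                for a in acts}
            for s, acts in MDP.items()}


def long_run_ratio(policy, start=0, steps=20000):
    """Accumulated expected r divided by accumulated expected w."""
    dist = {s: 1.0 if s == start else 0.0 for s in MDP}
    total_r = total_w = 0.0
    for _ in range(steps):
        nxt = {s: 0.0 for s in MDP}
        for s, mass in dist.items():
            for a, q in policy[s].items():
                t, r, w = MDP[s][a]
                total_r += mass * q * r
                total_w += mass * q * w
                nxt[t] += mass * q
        dist = nxt
    return total_r / total_w


ratios = [long_run_ratio(mixed_policy(x)) for x in (0.1, 0.01, 0.0)]
```

In this example the stationary analysis gives the value $1 + 2x$ for the mixed policy, so the fair policies obtained for $x > 0$ approach the value 1 of the unfair minimizing policy as $x \to 0$.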

The complexity of the model-checking problem is given by the following result,
which is an immediate consequence of Theorem 4 and Algorithm 1.

Theorem 6 The complexity of the model-checking problem for $H^-_{\mathcal{F}}(s)$, $H^+_{\mathcal{F}}(s)$
is polynomial in the size of the translation TPS.

We  conclude  by  showing  that  the  notions  of  weak  fairness  and  probabilistic 
fairness  coincide  for  finite  TPSs  and  long-run  average  properties. 

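Theorem 7 Let $\hat{H}^-_{\mathcal{F}}(s) = \inf_{\eta \in \hat\eta(\mathcal{F})} H^-_\eta(s)$, where $\hat\eta(\mathcal{F})$ is the set of weakly
fair policies, defined according to [KB96]. Then, $\hat{H}^-_{\mathcal{F}}(s) = H^-_{\mathcal{F}}(s)$. A similar
result holds for $H^+_{\mathcal{F}}(s)$.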

Proof. The result follows from an analysis of the proof of Theorem 5, together
with the observation that probabilistically fair policies are also weakly fair.